Importing database with phenotypic values

The database used here is composed of 39 Assaf ewes in the first lactation which had the estrus synchronized and the lambing performed in a short window in order to avoid variations caused by the lactation stage. These animals were housed in individual pens and were milked twice a day in a 1x10 stall milking parlor (DeLaval). In the first stage, these animals were fed ad libitum a total mixed ratio (TMR) formulated from alfalfa hay (particle size > 4 cm) and concentrates (50:50 forage:concentrate ratio). These ewes had their dry matter intake (DMI) and milk yield measured daily for a period of three weeks after a period of three weeks of adaptation to the TMR. For each animal, the feed intake was estimated daily by weighting the dry matter refused. Additionally, morning and evening milk production was weighted for each animal in order to calculate the daily milk yield. Protein, fat, and lactose content were estimated for each animal as described by Barrio et al. (2022). The changes in the body weight (BW) were calculated by each ewe through the recording of two consecutive days at the beginning and at the end of the FE evaluation.

#Replace the pathway by the folder in your computer

FE.db<-read.table("~/post_doc_Leon/projetos/RFI_sheep_R_calculations/databases_FE/FE_AllMetrics_database_sheep_07_02_23.txt", h=T, sep="\t")

head(FE.db)
##   Animal_ID Treatment meanBW BWC1 days_period  Fat Protein Milk intake
## 1     61472   Control   66.9 -1.0          31 66.2    49.1 2.63  2.610
## 2     61473   Control   73.4  0.0          31 59.8    46.1 2.55  2.714
## 3     61474   Control   66.1  0.6          31 55.3    49.8 1.53  2.462
## 4     61477 Challenge   54.4 -6.0          31 71.7    50.6 2.18  2.307
## 5     61480   Control   66.8  0.0          31 55.4    49.6 2.63  2.985
## 6     61483   Control   77.9  0.0          31 52.8    49.6 3.04  3.156
##   energyTMR DIM
## 1      0.93  23
## 2      0.93  29
## 3      0.93  32
## 4      0.93  31
## 5      0.93  50
## 6      0.93  28

The following columns are observed in the database:

Animal_ID: Animal identification

Treatment: Categorical variable indicating if the animals was subjected or not to a protein restriction (~40% reduction) challenge

meanBW: Average body weight (kg)

BWC1: Body weight change during the experimental period

days_period: Days in the experiment

Fat: Fat yeild (g/kg)

Protein: Protein yield (g/kg)

Milk: Milk yield (kg/d)

intake: Actual dry matter intake

energyTMR: Energy content of the total mixed ratio (TMR)

DIM: Days in milk

Importing required libraries and functions

library(ggplot2)
library(PerformanceAnalytics)
library(plotly)
library(patchwork)
#Defining custom theme for plots in the document
custom_theme<-function(){
  theme(panel.grid.major = element_line(colour = "#d3d3d3"),
        panel.grid.minor = element_blank(),
        panel.border = element_blank(),
        panel.background = element_blank(),
        plot.title = element_text(size = 14, face = "bold"),
        axis.title = element_text(face="bold", size=20),
        axis.text.x = element_text(colour="black", size = 20, angle = 0),
        axis.text.y = element_text(colour="black", size = 20),
        axis.line = element_line(size=0.5, colour = "black"),
        legend.title = element_text(size = 25),
        legend.text = element_text(size=20),
        strip.text.x = element_text(size = 15))
        }

Estimating additional metrics for FEI, FCR and RFI calculations

In order to calculate the FE metrics, some additional values must be estimated.

Body weight change during the period (BWC2)

#Defined as the body weight change during the whole period divided by the days 
#in the period

FE.db$BWC2<-FE.db$BWC1/FE.db$days_period

Metabolic body weight

#Defined as the body weight raised to 0.75

FE.db$MBW<-FE.db$meanBW^0.75

Energy requirements for BW change, UFL/d

#Defined based on INRA equations

FE.db$ER_BWC<-(0.28*(FE.db$BWC2*1000))/100

Energy requirements for maintenance, UFL/d

#Defined based on INRA equations

FE.db$ER_main<-(0.0345*FE.db$MBW)

Mean milk yield, L/d (MY)

FE.db$MY<-FE.db$Milk/1.036

Fat yield, L/d (MY)

FE.db$FY<-FE.db$Fat/1.036

Protein yield, L/d (MY)

FE.db$PY<-FE.db$Protein/1.036

Metabolic body weight change per day

#Defined as the absolute BWC1 raised to 0.75

FE.db$MBWC1<-abs(FE.db$BWC1)^0.75

Metabolic body weight change for the experimental period

#Defined as the absolute BWC2 raised to 0.75

FE.db$MBWC2<-abs(FE.db$BWC2)^0.75

Energy corrected milk (ECM, kg/d) - INRA 2018

#Defined based on INRA equations

FE.db$ECM<-FE.db$MY*(0.0071*FE.db$FY+0.0043*FE.db$PY+0.2224)

Energy requirements for milk yield, UFL/d

#Defined based on INRA equations

FE.db$ER_MY<-0.686*FE.db$ECM

Total energy requirements, UFL/d

#Defined as the sum of the Energy requirements for MY, maintenance, and BWC

FE.db$TER<-rowSums(FE.db[,c("ER_MY","ER_main","ER_BWC")])

Predicted DM intake

#Defined as the ratio between total energy requirements and the energy on TMR

FE.db$Pred_DMI<-FE.db$TER/FE.db$energyTMR

Checking new database

head(FE.db)
##   Animal_ID Treatment meanBW BWC1 days_period  Fat Protein Milk intake
## 1     61472   Control   66.9 -1.0          31 66.2    49.1 2.63  2.610
## 2     61473   Control   73.4  0.0          31 59.8    46.1 2.55  2.714
## 3     61474   Control   66.1  0.6          31 55.3    49.8 1.53  2.462
## 4     61477 Challenge   54.4 -6.0          31 71.7    50.6 2.18  2.307
## 5     61480   Control   66.8  0.0          31 55.4    49.6 2.63  2.985
## 6     61483   Control   77.9  0.0          31 52.8    49.6 3.04  3.156
##   energyTMR DIM        BWC2      MBW      ER_BWC   ER_main       MY       FY
## 1      0.93  23 -0.03225806 23.39212 -0.09032258 0.8070281 2.538610 63.89961
## 2      0.93  29  0.00000000 25.07680  0.00000000 0.8651495 2.461390 57.72201
## 3      0.93  32  0.01935484 23.18201  0.05419355 0.7997794 1.476834 53.37838
## 4      0.93  31 -0.19354839 20.03084 -0.54193548 0.6910640 2.104247 69.20849
## 5      0.93  50  0.00000000 23.36589  0.00000000 0.8061232 2.538610 53.47490
## 6      0.93  28  0.00000000 26.22123  0.00000000 0.9046325 2.934363 50.96525
##         PY     MBWC1      MBWC2      ECM     ER_MY      TER Pred_DMI
## 1 47.39382 1.0000000 0.07611649 2.233674 1.5323003 2.249006 2.418286
## 2 44.49807 0.0000000 0.00000000 2.027122 1.3906056 2.255755 2.425543
## 3 48.06950 0.6817316 0.05189102 1.193408 0.8186778 1.672651 1.798549
## 4 48.84170 3.8336586 0.29180462 1.943903 1.3335172 1.482646 1.594243
## 5 47.87645 0.0000000 0.00000000 2.051046 1.4070175 2.213141 2.379721
## 6 47.87645 0.0000000 0.00000000 2.318505 1.5904942 2.495127 2.682932

Estimating FEI and FCR

After the estimation of the additional metrics above mentioned, the estimation of FCR and FEI it is a straightforward approach.

Feed efficiency index (FEI)

#FEI is defined as Actual dry matter intake subtracted by the predicted dry matter intake

FE.db$FEI<-FE.db$intake-FE.db$Pred_DMI
#Checking distribution of FEI values

ggplot(FE.db, aes(x=FEI)) + 
  geom_histogram(color="black", fill="white") + 
  custom_theme() 

#Checking correlation between FEI and actual intake values

cor_FEI_DMI<-round(cor(FE.db$intake, FE.db$FEI),2)

text_FEI<-paste("Animal ID: ", FE.db$Animal_ID, "\n","DMI: ", 
                round(FE.db$intake,2), "\n" ,
                "FEI: ", round(FE.db$FEI,2),sep="")

FEI_plot<-ggplot(FE.db, aes(y=intake,x=FEI, text=text_FEI)) + 
  geom_point() + 
  custom_theme() + scale_y_continuous(name="DMI") +
  ggtitle(paste("Actual DMI vs FEI (r2= ", cor_FEI_DMI,")",sep=""))

ggplotly(FEI_plot, tooltip = "text")

Feed Conversion Ratio (FCR)

#FCR is defined as Feed intake (kg DM/d) / ECM (kg/d), where ECM is estimated based on the ECM equation from INRA 2018 (for sheep) 

FE.db$FCR<-FE.db$intake/FE.db$ECM
#Checking distribution of FCR values

ggplot(FE.db, aes(x=FCR)) + 
  geom_histogram(color="black", fill="white") + 
  custom_theme() 

#Checking correlation between FCR and actual intake values

cor_FCR_DMI<-round(cor(FE.db$intake, FE.db$FCR),2)

text_FCR<-paste("Animal ID: ", FE.db$Animal_ID, "\n","DMI: ", 
                round(FE.db$intake,2), "\n" ,
                "FCR: ", round(FE.db$FCR,2),sep="")

FCR_plot<-ggplot(FE.db, aes(y=intake,x=FCR, text=text_FCR)) + 
  geom_point() + 
  custom_theme() + scale_y_continuous(name="DMI") +
  ggtitle(paste("Actual DMI vs FCR (r2= ", cor_FCR_DMI,")",sep=""))

ggplotly(FCR_plot, tooltip = "text")

Estimating Residual feed intake (RFI)

The RFI is defined as the subtraction of the actual feed intake by the predicted feed intake. The prediction of feed intake can be performed using different models. Here, for models to predict the feed intake and subsequently the RFI will be shown.

RFI estimation based on the model proposed by Pryce et al. 2015

#A linear model including ECM, MBW, BWC1 and DIM is used to predict the feed intake

model.pryce<-lm(intake ~ ECM + MBW + BWC1 + DIM,data=FE.db)

#Checking model output

summary(model.pryce)
## 
## Call:
## lm(formula = intake ~ ECM + MBW + BWC1 + DIM, data = FE.db)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.35819 -0.08819 -0.00099  0.09000  0.26783 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 0.159225   0.347147   0.459  0.64939    
## ECM         0.503667   0.057933   8.694 3.70e-10 ***
## MBW         0.061756   0.012793   4.827 2.87e-05 ***
## BWC1        0.057814   0.017037   3.393  0.00177 ** 
## DIM         0.004807   0.003190   1.507  0.14102    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1377 on 34 degrees of freedom
## Multiple R-squared:  0.8102, Adjusted R-squared:  0.7879 
## F-statistic: 36.29 on 4 and 34 DF,  p-value: 7.939e-12
#Extracting residuals (RFI)
FE.db$RFI_pryce<-residuals(model.pryce)

head(FE.db$RFI_pryce)
## [1] -0.17160200 -0.15426355  0.08155537  0.12954168  0.10939969  0.07510774
#Checking distribution of RFI_pryce values

ggplot(FE.db, aes(x=RFI_pryce)) + 
  geom_histogram(color="black", fill="white") + 
  custom_theme() 

#Checking the correlation between predicted and actual values

r2_pryce<-round(summary(model.pryce)$adj.r.squared,3)

RFI_pryce_table<-data.frame(DMI=FE.db$intake, predicted_DMI=predict(model.pryce))

text_pryce<-paste("Animal ID: ", FE.db$Animal_ID, "\n","DMI: ", 
                round(RFI_pryce_table$DMI,2), "\n" ,
                "RFI_pryce: ", round(RFI_pryce_table$predicted_DMI,2),sep="")

plot_pryce<-ggplot(RFI_pryce_table, aes(y=DMI,x=predicted_DMI, text=text_pryce)) + 
  geom_point() + 
  custom_theme() + 
  ggtitle(paste("Actual DMI vs predicted DMI by pryce (r2= ", r2_pryce,")",sep=""))

ggplotly(plot_pryce, tooltip = "text")

RFI estimation based on ECM, MBW and MBWC2

#A linear model including ECM, MBW and MBWC2 is used to predict the feed intake

model.RFI2<-lm(intake ~ ECM + MBW + MBWC2,data=FE.db)

#Checking model output

summary(model.RFI2)
## 
## Call:
## lm(formula = intake ~ ECM + MBW + MBWC2, data = FE.db)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.52406 -0.05510  0.00208  0.09454  0.31614 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.40214    0.34773   1.156    0.255    
## ECM          0.43302    0.07291   5.939 9.29e-07 ***
## MBW          0.06804    0.01484   4.584 5.60e-05 ***
## MBWC2       -0.15832    0.42091  -0.376    0.709    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1751 on 35 degrees of freedom
## Multiple R-squared:  0.684,  Adjusted R-squared:  0.657 
## F-statistic: 25.26 on 3 and 35 DF,  p-value: 7.073e-09
#Extracting residuals (RFI)
FE.db$RFI2<-residuals(model.RFI2)

head(FE.db$RFI2)
## [1] -0.33895494 -0.27219170 -0.02603707 -0.25362495  0.10486107 -0.03423561
#Checking distribution of RFI_pryce values

ggplot(FE.db, aes(x=RFI2)) + 
  geom_histogram(color="black", fill="white") + 
  custom_theme()

#Checking the correlation between predicted and actual values

r2_RFI2<-round(summary(model.RFI2)$adj.r.squared,3)

RFI_RFI2_table<-data.frame(DMI=FE.db$intake, predicted_DMI=predict(model.RFI2))

text_RFI2<-paste("Animal ID: ", FE.db$Animal_ID, "\n","DMI: ", 
                round(RFI_RFI2_table$DMI,2), "\n" ,
                "RFI2: ", round(RFI_RFI2_table$predicted_DMI,2),sep="")

plot_RFI2<-ggplot(RFI_RFI2_table, aes(y=DMI,x=predicted_DMI, text=text_RFI2)) + 
  geom_point() + 
  custom_theme() + 
  ggtitle(paste("Actual DMI vs predicted DMI by RFI2 (r2= ", r2_RFI2,")",sep=""))

ggplotly(plot_RFI2, tooltip = "text")

RFI estimation based on ECM, MBW and an interaction between MeanBW and BWC2

#A linear model including ECM, MBW and an interaction term between meanBW and BWC2 is used to predict the feed intake

model.RFI3<-lm(intake ~ ECM + MBW + meanBW*BWC2,data=FE.db)

#Checking model output

summary(model.RFI3)
## 
## Call:
## lm(formula = intake ~ ECM + MBW + meanBW * BWC2, data = FE.db)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.33820 -0.09806  0.01144  0.07646  0.27749 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.62071    3.91880   0.414    0.682    
## ECM          0.52318    0.06219   8.413 1.02e-09 ***
## MBW         -0.14470    0.69578  -0.208    0.837    
## meanBW       0.05270    0.18517   0.285    0.778    
## BWC2        -0.62360    3.81764  -0.163    0.871    
## meanBW:BWC2  0.04517    0.06165   0.733    0.469    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1427 on 33 degrees of freedom
## Multiple R-squared:  0.8022, Adjusted R-squared:  0.7723 
## F-statistic: 26.77 on 5 and 33 DF,  p-value: 1.004e-10
#Extracting residuals (RFI)
FE.db$RFI3<-residuals(model.RFI3)

head(FE.db$RFI3)
## [1] -0.24251485 -0.20657496  0.04239889  0.05597538  0.15213866  0.01144312
#Checking distribution of RFI_pryce values

ggplot(FE.db, aes(x=RFI3)) + 
  geom_histogram(color="black", fill="white") + 
  custom_theme()

#Checking the correlation between predicted and actual values

r2_RFI3<-round(summary(model.RFI3)$adj.r.squared,3)

RFI_RFI3_table<-data.frame(DMI=FE.db$intake, predicted_DMI=predict(model.RFI3))

text_RFI3<-paste("Animal ID: ", FE.db$Animal_ID, "\n","DMI: ", 
                round(RFI_RFI3_table$DMI,2), "\n" ,
                "RFI3: ", round(RFI_RFI3_table$predicted_DMI,2),sep="")

plot_RFI3<-ggplot(RFI_RFI3_table, aes(y=DMI,x=predicted_DMI, text=text_RFI3)) + 
  geom_point() + 
  custom_theme() + 
  ggtitle(paste("Actual DMI vs predicted DMI by pryce (r2= ", r2_RFI3,")",sep=""))

ggplotly(plot_RFI3, tooltip = "text")

Estimating correlation between FE metrics

The Pearson correlation across the FE variables estimated here will be calculated and plotted using the R package PerformanceAnalytics. It is possible to note that, in general, a strong correlation is observed between the variables.

chart.Correlation(FE.db[,c("FEI","FCR", "RFI_pryce", "RFI2", "RFI3")], histogram=TRUE, pch=19)

Creating a final database with FE metrics

Let’s create a final databse containing only the Animal Id and the FE metrics estimated in the last steps.

#Filtering the database created in this tutorial based on the column names
FE.metrics<-FE.db[,c("Animal_ID","Treatment","FEI","FCR", "RFI_pryce", "RFI2", "RFI3")]

head(FE.metrics)
##   Animal_ID Treatment       FEI      FCR   RFI_pryce        RFI2        RFI3
## 1     61472   Control 0.1917141 1.168478 -0.17160200 -0.33895494 -0.24251485
## 2     61473   Control 0.2884569 1.338844 -0.15426355 -0.27219170 -0.20657496
## 3     61474   Control 0.6634509 2.063000  0.08155537 -0.02603707  0.04239889
## 4     61477 Challenge 0.7127573 1.186788  0.12954168 -0.25362495  0.05597538
## 5     61480   Control 0.6052787 1.455355  0.10939969  0.10486107  0.15213866
## 6     61483   Control 0.4730680 1.361222  0.07510774 -0.03423561  0.01144312

This database could be save in your computer using the following code:

#Filtering the database created in this tutorial based on the column names
write.table(FE.metrics, file="Feed_EfficiencyMetrics_smarter.txt", col.names=T, row.names=F, sep="\t", quote=F)

This will create the tab-delimited file “Feed_EfficiencyMetrics_smarter.txt” in your work directory.

Checking the effect of protein restriction over feed efficiency

The animals which composed the database analyzed here were subject to a nutritional protein restriction (NPR). Initially, 40 lamb ewes from a single flock in the northwest region of Castilla y León (Spain) that were transported to the facilities of the IGM in León were fed a standard diet for replacement ewe lambs providing 16% crude protein until three months of age and were subsequently divided into two groups. The two experimental groups were composed of 20 NPR and 20 C animals. To evaluate the impact of feed restriction challenge due to a trade market problem and a shortage of concentrate inputs, the C ewes received the standard diet mentioned above for 64 d; during the same period, the NPR ewes received the same diet without soybean meal (44% reduction in protein intake). The 64-d NPR period in the prepuberal stage was coincident with the allometric growth of the mammary gland. Now, we will evaluate if the feed efficiency metrics estimated here are significantly different between control and challenge groups.

FEI

# Here we will compare the means of FEI between control and challenge groups using a t-test. But first, we need to define if there is variance equality between groups.

var.test(FEI ~ Treatment, data=FE.metrics)
## 
##  F test to compare two variances
## 
## data:  FEI by Treatment
## F = 1.5248, num df = 19, denom df = 18, p-value = 0.3758
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
##  0.5918188 3.8816358
## sample estimates:
## ratio of variances 
##           1.524777
#The result results of the variance equality test (p-value = 0.3758) suggest that the variances are equal between groups. Therefore, we are assuming this in the t-test. 

t.test(FEI ~ Treatment, data=FE.metrics, var.equal = TRUE)
## 
##  Two Sample t-test
## 
## data:  FEI by Treatment
## t = 1.4407, df = 37, p-value = 0.1581
## alternative hypothesis: true difference in means between group Challenge and group Control is not equal to 0
## 95 percent confidence interval:
##  -0.02965722  0.17561563
## sample estimates:
## mean in group Challenge   mean in group Control 
##               0.4925271               0.4195479

FCR

# Here we will compare the means of FCR between control and challenge groups using a t-test. But first, we need to define if there is variance equality between groups.

var.test(FCR ~ Treatment, data=FE.metrics)
## 
##  F test to compare two variances
## 
## data:  FCR by Treatment
## F = 1.0131, num df = 19, denom df = 18, p-value = 0.9814
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
##  0.3932081 2.5789829
## sample estimates:
## ratio of variances 
##           1.013071
#The result results of the variance equality test (p-value = 0.9814) suggest that the variances are equal between groups. Therefore, we are assuming this in the t-test. 

t.test(FCR ~ Treatment, data=FE.metrics, var.equal = TRUE)
## 
##  Two Sample t-test
## 
## data:  FCR by Treatment
## t = -0.14205, df = 37, p-value = 0.8878
## alternative hypothesis: true difference in means between group Challenge and group Control is not equal to 0
## 95 percent confidence interval:
##  -0.1705026  0.1481624
## sample estimates:
## mean in group Challenge   mean in group Control 
##                1.451066                1.462236

RFI pryce

# Here we will compare the means of RFI_pryce between control and challenge groups using a t-test. But first, we need to define if there is variance equality between groups.

var.test(RFI_pryce ~ Treatment, data=FE.metrics)
## 
##  F test to compare two variances
## 
## data:  RFI_pryce by Treatment
## F = 1.1357, num df = 19, denom df = 18, p-value = 0.7907
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
##  0.4408033 2.8911515
## sample estimates:
## ratio of variances 
##           1.135697
#The result results of the variance equality test (p-value = 0.7907) suggest that the variances are equal between groups. Therefore, we are assuming this in the t-test. 

t.test(RFI_pryce ~ Treatment, data=FE.metrics, var.equal = TRUE)
## 
##  Two Sample t-test
## 
## data:  RFI_pryce by Treatment
## t = 1.3434, df = 37, p-value = 0.1873
## alternative hypothesis: true difference in means between group Challenge and group Control is not equal to 0
## 95 percent confidence interval:
##  -0.02819448  0.13913472
## sample estimates:
## mean in group Challenge   mean in group Control 
##              0.02702390             -0.02844622

RFI2

# Here we will compare the means of RFI2 between control and challenge groups using a t-test. But first, we need to define if there is variance equality between groups.

var.test(RFI2 ~ Treatment, data=FE.metrics)
## 
##  F test to compare two variances
## 
## data:  RFI2 by Treatment
## F = 0.49525, num df = 19, denom df = 18, p-value = 0.1377
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
##  0.1922247 1.2607679
## sample estimates:
## ratio of variances 
##          0.4952525
#The result results of the variance equality test (p-value = 0.1377) suggest that the variances are equal between groups. Therefore, we are assuming this in the t-test. 

t.test(RFI2 ~ Treatment, data=FE.metrics, var.equal = TRUE)
## 
##  Two Sample t-test
## 
## data:  RFI2 by Treatment
## t = 0.45002, df = 37, p-value = 0.6553
## alternative hypothesis: true difference in means between group Challenge and group Control is not equal to 0
## 95 percent confidence interval:
##  -0.08576579  0.13474031
## sample estimates:
## mean in group Challenge   mean in group Control 
##              0.01192969             -0.01255757

RFI3

# Here we will compare the means of RFI3 between control and challenge groups using a t-test. But first, we need to define if there is variance equality between groups.

var.test(RFI3 ~ Treatment, data=FE.metrics)
## 
##  F test to compare two variances
## 
## data:  RFI3 by Treatment
## F = 0.87033, num df = 19, denom df = 18, p-value = 0.765
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
##  0.3378039 2.2155967
## sample estimates:
## ratio of variances 
##          0.8703264
#The result results of the variance equality test (p-value = 0.765) suggest that the variances are equal between groups. Therefore, we are assuming this in the t-test. 

t.test(RFI3 ~ Treatment, data=FE.metrics, var.equal = TRUE)
## 
##  Two Sample t-test
## 
## data:  RFI3 by Treatment
## t = 1.6375, df = 37, p-value = 0.11
## alternative hypothesis: true difference in means between group Challenge and group Control is not equal to 0
## 95 percent confidence interval:
##  -0.01620261  0.15271109
## sample estimates:
## mean in group Challenge   mean in group Control 
##              0.03325207             -0.03500218

We can also check the distribution of each value within control and challenge groups using boxplots.

#FEI
FEI_NPR_plot<-ggplot(FE.metrics, aes(y=FEI,x=Treatment, fill=Treatment)) + 
  geom_boxplot() + scale_x_discrete(name = "NPR Groups") +
  theme_bw() + scale_fill_manual(values=c("chartreuse", "brown3")) + 
  ggtitle("FEI")

#FCR
FCR_NPR_plot<-ggplot(FE.metrics, aes(y=FCR,x=Treatment, fill=Treatment)) + 
  geom_boxplot() + scale_x_discrete(name = "NPR Groups") +
  theme_bw() + scale_fill_manual(values=c("chartreuse", "brown3")) + 
  ggtitle("FCR")

#RFI_pryce
RFI_pryce_NPR_plot<-ggplot(FE.metrics, aes(y=RFI_pryce,x=Treatment, fill=Treatment)) + 
  geom_boxplot() + scale_x_discrete(name = "NPR Groups") +
  theme_bw() + scale_fill_manual(values=c("chartreuse", "brown3")) + 
  ggtitle("RFI_pryce")


#RFI2
RFI2_NPR_plot<-ggplot(FE.metrics, aes(y=RFI2,x=Treatment, fill=Treatment)) + 
  geom_boxplot() + scale_x_discrete(name = "NPR Groups") +
  theme_bw() + scale_fill_manual(values=c("chartreuse", "brown3")) + 
  ggtitle("RFI2")

#RFI3
RFI3_NPR_plot<-ggplot(FE.metrics, aes(y=RFI3,x=Treatment, fill=Treatment)) + 
  geom_boxplot() + scale_x_discrete(name = "NPR Groups") +
  theme_bw() + scale_fill_manual(values=c("chartreuse", "brown3")) + 
  ggtitle("RFI3")


(FEI_NPR_plot | FCR_NPR_plot) / (RFI_pryce_NPR_plot | RFI2_NPR_plot | RFI3_NPR_plot)

Additional comment

If we check the summary of the models used here to estimate RFI, we will notice that the intercept is not significant for any model. Consequently, what would happen if we exclude the intercept from our models? But, even more important… Should we remove the intercept from our model in this case?

RFI estimation based on the model proposed by Pryce et al. 2015

#A linear model including ECM, MBW, BWC1 and DIM is used to predict the feed intake

model.pryce_NoInt<-lm(intake ~ 0 + ECM + MBW + BWC1 + DIM,data=FE.db)

#Checking model output

summary(model.pryce_NoInt)
## 
## Call:
## lm(formula = intake ~ 0 + ECM + MBW + BWC1 + DIM, data = FE.db)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.35945 -0.09059  0.00206  0.09164  0.27497 
## 
## Coefficients:
##      Estimate Std. Error t value Pr(>|t|)    
## ECM  0.503955   0.057272   8.799 2.16e-10 ***
## MBW  0.066926   0.005981  11.190 4.13e-13 ***
## BWC1 0.055159   0.015842   3.482  0.00136 ** 
## DIM  0.005724   0.002456   2.331  0.02567 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1361 on 35 degrees of freedom
## Multiple R-squared:  0.9979, Adjusted R-squared:  0.9977 
## F-statistic:  4184 on 4 and 35 DF,  p-value: < 2.2e-16
#Extracting residuals (RFI)
FE.db$RFI_pryce_NoInt<-residuals(model.pryce_NoInt)

head(FE.db$RFI_pryce_NoInt)
## [1] -0.15772512 -0.15188646  0.09281117  0.14026517  0.10135052  0.07240085
#Checking distribution of RFI_pryce_NoInt values

ggplot(FE.db, aes(x=RFI_pryce_NoInt)) + 
  geom_histogram(color="black", fill="white") + 
  custom_theme() 

#Checking the correlation between predicted and actual values

r2_pryce_NoInt<-round(summary(model.pryce)$adj.r.squared,3)

RFI_pryce_NoInt_table<-data.frame(DMI=FE.db$intake, predicted_DMI=predict(model.pryce))

text_pryce_NoInt<-paste("Animal ID: ", FE.db$Animal_ID, "\n","DMI: ", 
                round(RFI_pryce_NoInt_table$DMI,2), "\n" ,
                "RFI_pryce_NoInt: ", 
                round(RFI_pryce_NoInt_table$predicted_DMI,2),sep="")

plot_pryce_NoInt<-ggplot(RFI_pryce_NoInt_table, aes(y=DMI,x=predicted_DMI, text=text_pryce_NoInt)) + 
  geom_point() + 
  custom_theme() + 
  ggtitle(paste("Actual DMI vs predicted DMI by pryce (r2= ", r2_pryce_NoInt,")",sep=""))

ggplotly(plot_pryce_NoInt, tooltip = "text")

RFI estimation based on ECM, MBW and MBWC2

#A linear model including ECM, MBW and MBWC2 is used to predict the feed intake

model.RFI2_NoInt<-lm(intake ~ 0 + ECM + MBW + MBWC2,data=FE.db)

#Checking model output

summary(model.RFI2_NoInt)
## 
## Call:
## lm(formula = intake ~ 0 + ECM + MBW + MBWC2, data = FE.db)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.53295 -0.06949  0.01668  0.08753  0.33648 
## 
## Coefficients:
##        Estimate Std. Error t value Pr(>|t|)    
## ECM    0.454906   0.070744   6.430 1.86e-07 ***
## MBW    0.083409   0.006646  12.551 1.04e-14 ***
## MBWC2 -0.074708   0.416595  -0.179    0.859    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1759 on 36 degrees of freedom
## Multiple R-squared:  0.9964, Adjusted R-squared:  0.9961 
## F-statistic:  3335 on 3 and 36 DF,  p-value: < 2.2e-16
#Extracting residuals (RFI)
FE.db$RFI2_NoInt<-residuals(model.RFI2_NoInt)

head(FE.db$RFI2_NoInt)
## [1] -0.35153773 -0.29977962 -0.01059957 -0.22624461  0.10304205 -0.08578759
#Checking distribution of RFI_pryce values

ggplot(FE.db, aes(x=RFI2_NoInt)) + 
  geom_histogram(color="black", fill="white") + 
  custom_theme()

#Checking the correlation between predicted and actual values

r2_RFI2_NoInt<-round(summary(model.RFI2_NoInt)$adj.r.squared,3)

RFI_RFI2_NoInt_table<-data.frame(DMI=FE.db$intake, predicted_DMI=predict(model.RFI2_NoInt))

text_RFI2_NoInt<-paste("Animal ID: ", FE.db$Animal_ID, "\n","DMI: ", 
                round(RFI_RFI2_NoInt_table$DMI,2), "\n" ,
                "RFI2_NoInt: ", 
                round(RFI_RFI2_NoInt_table$predicted_DMI,2),sep="")

plot_RFI2_NoInt<-ggplot(RFI_RFI2_NoInt_table, aes(y=DMI,x=predicted_DMI, text=text_RFI2_NoInt)) + 
  geom_point() + 
  custom_theme() + 
  ggtitle(paste("Actual DMI vs predicted DMI by RFI2 (r2= ", r2_RFI2_NoInt,")",sep=""))

ggplotly(plot_RFI2_NoInt, tooltip = "text")

RFI estimation based on ECM, MBW and an interaction between MeanBW and BWC2

#A linear model including ECM, MBW and an interaction term between meanBW and BWC2 is used to predict the feed intake

model.RFI3_NoInt<-lm(intake ~ 0 + ECM + MBW + meanBW*BWC2,data=FE.db)

#Checking model output

summary(model.RFI3_NoInt)
## 
## Call:
## lm(formula = intake ~ 0 + ECM + MBW + meanBW * BWC2, data = FE.db)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.34078 -0.08942  0.01195  0.06968  0.27717 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## ECM          0.53103    0.05850   9.078 1.31e-10 ***
## MBW          0.14260    0.03842   3.711 0.000734 ***
## meanBW      -0.02368    0.01313  -1.804 0.080091 .  
## BWC2        -1.03377    3.64135  -0.284 0.778210    
## meanBW:BWC2  0.05177    0.05882   0.880 0.384963    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1409 on 34 degrees of freedom
## Multiple R-squared:  0.9978, Adjusted R-squared:  0.9975 
## F-statistic:  3123 on 5 and 34 DF,  p-value: < 2.2e-16
#Extracting residuals (RFI)
FE.db$RFI3_NoInt<-residuals(model.RFI3_NoInt)

head(FE.db$RFI3_NoInt)
## [1] -0.24907915 -0.20006074  0.04175125  0.05170519  0.14589767  0.03058491
#Checking distribution of RFI_pryce values

ggplot(FE.db, aes(x=RFI3_NoInt)) + 
  geom_histogram(color="black", fill="white") + 
  custom_theme()

#Checking the correlation between predicted and actual values

r2_RFI3_NoInt<-round(summary(model.RFI3_NoInt)$adj.r.squared,3)

RFI_RFI3_NoInt_table<-data.frame(DMI=FE.db$intake, predicted_DMI=predict(model.RFI3_NoInt))

text_RFI3_NoInt<-paste("Animal ID: ", FE.db$Animal_ID, "\n","DMI: ", 
                round(RFI_RFI3_NoInt_table$DMI,2), "\n" ,
                "RFI3_NoInt: ", 
                round(RFI_RFI3_NoInt_table$predicted_DMI,2),sep="")

plot_RFI3_NoInt<-ggplot(RFI_RFI3_NoInt_table, aes(y=DMI,x=predicted_DMI, text=text_RFI3_NoInt)) + 
  geom_point() + 
  custom_theme() + 
  ggtitle(paste("Actual DMI vs predicted DMI by RFI3 (r2= ", r2_RFI3_NoInt,")",sep=""))

ggplotly(plot_RFI3_NoInt, tooltip = "text")

It is posisble to note that much higher adjusted R-squares were obtained for all the models. However, should we trust on those values?

Problems when you remove the intercept

Let’s use the models with higher R-adjusted with intercept and its version without intercept as example. First, we should check the distribution of the residuals obtained in both models.

# Checking the mean and standard deviation of the residuals obtained in each model

table.res<-data.frame(res_pryce=residuals(model.pryce), res_pryce_NoInt=residuals(model.pryce_NoInt))

head(table.res)
##     res_pryce res_pryce_NoInt
## 1 -0.17160200     -0.15772512
## 2 -0.15426355     -0.15188646
## 3  0.08155537      0.09281117
## 4  0.12954168      0.14026517
## 5  0.10939969      0.10135052
## 6  0.07510774      0.07240085
Mean and standard deviation of the residuals obtained in the models based on Pryce et al. (2015) with and without intercept.
Mean SD
RFI Pryce 0e+00 0.1302
RFI Pryce without intercept 6e-04 0.1306
res.pryce<-ggplot(model.pryce, aes(y=.resid,x=.fitted)) + 
  geom_point() + 
  custom_theme() + 
  ggtitle("RFI Pryce et al. (2015)") + 
  custom_theme() + scale_y_continuous(name="Residuals") +
  scale_x_continuous(name="Fitted values") 

res.pryce.NoInt<-ggplot(model.pryce_NoInt, aes(y=.resid,x=.fitted)) + 
  geom_point() + 
  custom_theme() + 
  ggtitle("RFI Pryce et al. (2015) without intercept") + 
  custom_theme() + scale_y_continuous(name="Residuals") +
  scale_x_continuous(name="Fitted values") 
  
res.pryce | res.pryce.NoInt

Conclusion about intercept removal

As we can note, the SD obtained for the residuals in both models are quite similar. However, the mean of the residuals in the model without intercept is not equal zero, which is a requirement for the kind of model used here. Additionally, when we remove the intercept, we force the program to set the this value to be equal zero. Consequently, assuming that our the intercepts of our regression lines pass through the zero, which is not the case of most of the models fitted with real data (in the case of a continuous variable). In summary, to run a model without intercept, in the current case, result in an inflation of the sums of squares (SS) for the model (SSmodel) and residuals (SSresidual), leading to the increase in the R-square values.